Abstract: To correct spectral peak drift and obtain more reliable net counts, this study proposes a long short-term
memory (LSTM) model fused with a convolutional neural network (CNN) that accurately estimates the relevant
parameters of a nuclear pulse signal by learning from samples. A predefined mathematical model was used to
generate a dataset composed of distorted pulse sequences, on which the CNN-LSTM model was trained. The trained
model was validated using simulated pulses. The relative errors in the amplitude estimation of pulse sequences with
different degrees of distortion were obtained using triangular shaping, the CNN-LSTM model, and the LSTM model.
For severely distorted pulses, the relative error of the CNN-LSTM model in estimating the pulse parameters was
reduced by 14.35% compared with that of the triangular shaping algorithm. For slightly distorted pulses, the
relative error of the CNN-LSTM model was reduced by 0.33% compared with that of the triangular shaping
algorithm. The model was then evaluated using two performance indicators, the correction ratio and the
efficiency ratio. The correction ratio is the proportion of the increase in peak area of the two characteristic peak
regions of interest (ROIs) to the peak area of the corrected characteristic peak ROI, and the efficiency ratio is the
proportion of the increase in peak area of the two characteristic peak ROIs to the peak areas of the two shadow
peak ROIs. Ten measurements of iron ore samples indicate that approximately 86.27% of the decreased peak area of
the shadow peak ROI was corrected to the characteristic peak ROI, and the proportion of the corrected peak area
to the peak area of the characteristic peak ROI was approximately 1.72%. The proposed CNN-LSTM model can
be applied to X-ray energy spectrum correction, which is of great significance for X-ray spectroscopy and
elemental content analyses.
1. Introduction


X-ray fluorescence refers to the X-rays emitted by a sample under irradiation by an excitation source; these X-rays
carry information on the elemental and chemical composition of the analyzed sample. In X-ray fluorescence
(XRF) spectrometry, the counting rate and energy resolution are important indicators that directly determine the
accuracy of the content analysis of each element in the tested sample [1]. In particular, when detecting elements
with low contents, peak drift and count loss have a considerable impact on the measurement results.
The main causes of peak drift and count loss are pulse distortions caused by the measurement system itself. The
key elements of the measurement system include a probe (integrating the detector and preamplifier), X-ray
tube, tested sample, front-end signal conditioning circuit, digital processing unit, controller unit, and host
computer [2]. Distorted pulses primarily include stacked, interfering, slow, spark, double, and truncated pulses.
In measurement systems using switch reset preamplifiers, distorted pulses are mainly truncated pulses, that is,
pulse signals whose amplitude suddenly jumps to zero owing to the reset of the switch, resulting in an
insufficient effective width. The resulting amplitude loss in the triangular shaping results limits current X-ray
fluorescence spectroscopy through spectral peak drift, unreliable net counts, and inaccurate element content
analysis.

In the field of X-ray fluorescence spectroscopy, research has mainly focused on digital pulse shaping [3] and
filtering [4], and an increasing number of researchers are applying new digital signal processing
technologies to problems in this field. The algorithm proposed by Zhong [5] solved the problem of poor
energy-spectrum resolution caused by pulse stacking and temperature fluctuations in an X-ray spectrum
system. A symmetric conversion method based on the Gaussian distribution [6] was proposed to obtain the γ-ray net
count from interlaced overlapping peaks in an HPGe γ-ray spectrometer system. A modified sparse reconstruction
method [7] was proposed to overcome pulse pile-up, especially at ultrahigh count rates; it uses two regularization
terms to compensate for the error caused by an inadequate sampling rate. To achieve high count rates, a new true
Gaussian digital shaper for detector pulses [8] and a compensation technology for pulse stacking [9] were proposed.
In our previous research, we proposed a pulse elimination method [10] and a pulse repair method [11] for distorted
pulses, both of which improved the accuracy of spectral analysis to a certain extent. These traditional pulse
processing methods clearly improve X-ray fluorescence spectrum analysis when pulse stacking or pulse distortion
is not particularly serious. However, they are significantly limited when pulse stacking or pulse distortion is
difficult to recognize, which has become a popular research topic in the field of spectrum processing.

In recent years, deep learning technology has developed rapidly, and many excellent models have emerged,
such as UNet [12], VNet [13], and U2Net [14]. These models have been widely used in medicine [15], industry [16],
control [17], and radiation measurements, including imaging quality improvement in radiation therapy [18], gamma
spectrum analysis [19], and pulse signal analysis based on residual structures [20]. Deep learning technology
provides various ideas for pulse processing in radiation measurement [21,22]. Touch [23] applied an artificial
neural network to energy-spectrum correction and achieved satisfactory results. Alberto et al. [24] proposed a
specific type of U-net that filters pulses and returns their height as an estimate of the pulse amplitude. Byoungil
et al. [25] proposed a deep-learning-based method for separating and predicting the true pulse height of a signal
for application in spectroscopy with a scintillation detector. Liu [26] investigated a pulse-coupled neural network
(PCNN) for higher anti-noise performance in neutron and gamma-ray (n-γ) discrimination. Ma [27,28]
accurately predicted the trapezoidal shaping parameters of stacked pulses using a long short-term memory
(LSTM) model. Building on previous research on pulse amplitude estimation, this paper proposes an LSTM model
fused with a convolutional neural network (CNN). Compared with other pulse estimation algorithms, this algorithm
exhibited better performance. The introduction and extension of this new technology to X-ray fluorescence
spectroscopy are of significant interest.


2. Principle and method 


2.1 Principle of peak drift 


The X-ray fluorescence spectrum is usually obtained with a multichannel pulse amplitude analyzer (MCA), in which
each pulse amplitude contributes one count to the counting histogram. When a pulse output by the measurement
system suffers an amplitude loss during the digital processing stage, its count in the histogram drifts to the
left. When the number of such pulses is sufficient, they appear as shadow peaks of the characteristic peaks in the
generated energy spectrum. The traditional spectrum acquisition process is shown in Fig. 1a and includes a
detector, a preamplifier, a CR differential shaper, a digital processing unit, and an MCA unit. For standard sources
with a single-element composition, the number of characteristic peaks is limited; therefore, the pulse amplitude of
the measured output mostly fluctuates within a certain range. Taking 2000 pulses with an amplitude of
approximately 600 mV as an example and assuming a pulse distortion ratio of 5%, the X-ray spectrum obtained
using the acquisition process of Fig. 1a is shown in Fig. 1b. In this figure, the channel range of the characteristic
peak region of interest (ROI) is 594-606, with a peak area (net peak count) of 1900. A shadow peak formed by the
100 distorted pulses appears near channel 520 on the left side of the characteristic peak, with a peak area of 100,
as shown by the green shaded area in Fig. 1b. In practical applications, if the counting rate of the characteristic
peak is high, the distorted pulses accumulate until a shadow peak is generated. This shadow peak not only reduces
the net count of the characteristic peak ROI but also introduces new difficulties to spectral analysis. Therefore,
this study proposes a deep-learning-based CNN-LSTM model that is added before the MCA unit to achieve an
accurate estimation of pulse parameters. The spectral acquisition process with the added model is shown in Fig. 1c.
In an ideal situation, when the amplitudes of the 100 distorted pulses are accurately estimated, the histogram of
the characteristic peak obtained by calling the model is that shown in Fig. 1d, where the ROI of the characteristic
peak remains unchanged but the total count increases to 2000. Calling the model to optimize the pulse amplitude
estimation therefore not only ensures that counts of the characteristic peak are not lost but also eliminates the
shadow peak caused by pulse distortion.
Fig. 1 Principle of peak drift and correction. a Spectrum acquisition process of the traditional method, b X-ray spectrum obtained by
the traditional method, c Spectrum acquisition process with the CNN-LSTM model, d Histogram of the characteristic peak obtained by
calling the CNN-LSTM model.
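The counting argument of Section 2.1 can be reproduced with a toy histogram. The sketch below is illustrative only: it assumes that 1 mV maps to one channel and draws amplitudes uniformly inside the ROI, neither of which is stated in the text; the helper names `build_spectrum` and `roi_area` are hypothetical.

```python
import random
from collections import Counter

def build_spectrum(amplitudes_mv):
    """Count pulses per channel (assumption: 1 mV corresponds to 1 channel)."""
    return Counter(round(a) for a in amplitudes_mv)

def roi_area(spectrum, lo, hi):
    """Sum of counts over the inclusive channel range [lo, hi]."""
    return sum(spectrum.get(ch, 0) for ch in range(lo, hi + 1))

rng = random.Random(0)
n_pulses, distortion_ratio = 2000, 0.05
n_distorted = round(n_pulses * distortion_ratio)  # 100 distorted pulses

# Undistorted pulses fall inside the characteristic ROI (channels 594-606).
good = [rng.uniform(594.5, 605.5) for _ in range(n_pulses - n_distorted)]
# Distorted pulses lose amplitude and accumulate near channel 520.
bad = [rng.uniform(517.5, 522.5) for _ in range(n_distorted)]
before = build_spectrum(good + bad)

# Ideal correction: every distorted pulse is restored to the ROI.
restored = [rng.uniform(594.5, 605.5) for _ in range(n_distorted)]
after = build_spectrum(good + restored)

print(roi_area(before, 594, 606), roi_area(before, 510, 530))  # 1900 100
print(roi_area(after, 594, 606), roi_area(after, 510, 530))    # 2000 0
```

The counts depend only on how many pulses fall in each channel range, so the result is the same for any random seed.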


2.2 Deep learning 


2.2.1 Data acquisition 


In the data acquisition stage, the datasets were produced. For a single distorted negative exponential pulse,
the pulse distortion time is assumed to be $t_{jump}$, and its mathematical model is given by Eq. (1), where $A$
represents the amplitude of the negative exponential pulse, $\tau$ represents the attenuation time constant, and
$T_{clk}$ represents the sampling period:

$$V(t) = \begin{cases} A \times \exp\left(-t \cdot \dfrac{T_{clk}}{\tau}\right), & t < t_{jump} \\ 0, & t \ge t_{jump} \end{cases} \tag{1}$$

The mathematical model of a pulse sequence composed of $N$ distorted negative exponential pulses is:

$$V(t) = \sum_{i=1}^{N} u(t - T_i) A_i e^{-(t - T_i)/\tau} \tag{2}$$

Here, $u(t)$ represents the unit-step signal, $A_i$ is the amplitude coefficient of the $i$-th nuclear pulse, $T_i$ is the
occurrence time of the $i$-th nuclear pulse, and $\tau$ represents the time constant. The negative exponential pulse
sequence after discretization can be expressed as

$$V_e(kT_{clk}) = \sum_{i=1}^{N} u(kT_{clk} - T_i) A_i e^{-(kT_{clk} - T_i)/\tau} \tag{3}$$
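Equations (1) and (3) can be sketched in a few lines of code. This is a minimal illustration, not the authors' generator: the time constant `TAU` is an assumed value (the text does not give one), onsets and truncation times are expressed as sample indices, and the function names are hypothetical. The 50 ns sampling period, the 2000 mV amplitude, the 500-sample spacing, and the truncation at samples 47 and 60 are taken from Section 3.1.2.

```python
import math

T_CLK = 50e-9  # sampling period stated in the text (50 ns)
TAU = 3.2e-6   # decay time constant: assumed value, not given in the text

def truncated_pulse(k, amplitude, k_jump, tau=TAU, t_clk=T_CLK):
    """Eq. (1) at sample index k: exponential decay, zero from k_jump onward."""
    if k < 0 or k >= k_jump:
        return 0.0
    return amplitude * math.exp(-k * t_clk / tau)

def pulse_sequence(n_samples, amplitudes, onsets, jumps):
    """Eq. (3) with each pulse truncated as in Eq. (1).

    onsets and jumps are sample indices; each jump is counted from its onset.
    """
    sequence = [0.0] * n_samples
    for amplitude, onset, k_jump in zip(amplitudes, onsets, jumps):
        # The unit step u(.) is realized by starting the loop at the onset.
        for k in range(onset, min(n_samples, onset + k_jump)):
            sequence[k] += truncated_pulse(k - onset, amplitude, k_jump)
    return sequence

seq = pulse_sequence(1000, amplitudes=[2000.0, 2000.0],
                     onsets=[0, 500], jumps=[47, 60])
print(seq[0], seq[47], seq[500], seq[560])  # 2000.0 0.0 2000.0 0.0
```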


The distorted nuclear pulse sequence $V_o(nT_{clk})$ used for the parameter estimation is regarded as the result of
applying triangular shaping to the $N$ distorted negative exponential pulse sequences $V_e(kT_{clk})$, and its
mathematical model is given by Eq. (4).

[Eq. (4): body not recoverable from the source text]
Here, the dataset of the CNN-LSTM model established in this study was taken from the sampled pulse amplitude
values of the distorted pulses after triangular shaping, whereas the parameter set $P$ was taken from the
negative exponential pulses before triangular shaping. Taking the parameter set $P_i$ of the $i$-th negative
exponential pulse as an example, this set includes the amplitude $A_i$ ($i = 1, 2, \ldots, N$), the sampling period
$T_{clk}$, the time constant $\tau$ of the negative exponential pulse, and the rising time $t_{up}$ of the triangular
shaping. The matrix representation of the dataset is as follows:

$$\begin{bmatrix} [V_o(T_{clk})]_1 & [V_o(2T_{clk})]_1 & \cdots & [V_o(nT_{clk})]_1 & P_1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ [V_o(T_{clk})]_N & [V_o(2T_{clk})]_N & \cdots & [V_o(nT_{clk})]_N & P_N \end{bmatrix} \tag{5}$$

The dataset in Eq. (5) contains $N$ triangular-shaped pulse sequences, each of which corresponds to a row in the
matrix. Each row contains $n + 1$ columns. The first $n$ columns hold the sampled amplitude values of the triangular
shaping result of the distorted pulses, and the last column holds the parameter set of the sequence, including
$A_i$, $T_{clk}$, $\tau$, and $t_{up}$. The structures of the generated datasets are shown in Fig. 2.

This study divided the dataset into training, test, and validation sets in a ratio of 7:2:1. In general, the training
set accounts for the largest proportion of the data and is used to fit the model so that it generalizes, whereas the
validation set is used to check whether the model is overfitted. If overfitting occurs, it is mitigated by adding a
dropout layer that randomly discards the connections of some neurons.
Fig. 2 Generated datasets
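A 7:2:1 split of the rows of Eq. (5) can be sketched as follows. The shuffling step, the seed, and the function name `split_dataset` are assumptions made for illustration; the rows here are placeholders (two fake samples plus a label) rather than real shaped pulses, and the dataset size of 10000 is the one quoted in Section 3.1.1.

```python
import random

def split_dataset(rows, ratios=(7, 2, 1), seed=0):
    """Shuffle the rows and split them into three parts by integer ratios."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    total = sum(ratios)
    n_first = len(rows) * ratios[0] // total
    n_second = len(rows) * ratios[1] // total
    return (rows[:n_first],
            rows[n_first:n_first + n_second],
            rows[n_first + n_second:])

# Placeholder rows: (sample_1, sample_2, parameter label), not real pulse data.
dataset = [(float(i), float(i) + 1.0, i) for i in range(10000)]
train, test, val = split_dataset(dataset)
print(len(train), len(test), len(val))  # 7000 2000 1000
```

Shuffling before splitting keeps severely and slightly distorted examples mixed across the three subsets, which matters if the generator emits them in order.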


2.2.2 Hyperparameter optimization 


The model adopted in this study processes sequences of nuclear pulse amplitudes. LSTM is a special
recurrent neural network (RNN) that differs from a standard RNN in its gating mechanism: the forget gate,
controlled by learned parameters, determines which information is retained and which is forgotten. LSTM can
therefore alleviate the vanishing and exploding gradient problems that arise when training on long time-series
samples.

LSTM usually deals with long sequences and large amounts of sample data, but an excessive amount of data makes
model training difficult, for example because of the computational complexity. Therefore, this study combined
LSTM with a CNN. A convolutional neural network (CNN) is less an algorithm than a feature extraction method,
and usually includes a convolution layer (with an activation layer), a pooling layer, and a fully connected layer.
Extracting features with a convolutional neural network is essentially the process of solving for the optimal
parameter matrix. The relationship between input and output can be expressed using Eq. (6):

$$y = f(x, W) = Wx + b \tag{6}$$

where $W$ represents the weight parameter matrix, $x$ represents the input neuron matrix, and $b$ represents the
offset term. The structure and parameter settings of the CNN model are listed in Table 1. By stacking multiple
convolution and pooling layers, the CNN greatly reduces the amount of data passed to the subsequent layers while
completing feature extraction. The CNN model used in this study included two one-dimensional convolution layers
and two pooling layers. The time step for each layer was set to one. The first convolution layer contained 64
convolution kernels and output 64 feature vectors, whereas the second contained 16 convolution kernels and output
16 feature vectors. In both convolution layers, the kernel size is set to 3*3, the stride is 1, the padding strategy
is "same", and the activation function is "relu". The pooling layers use MaxPooling1D, which does not change the
input signal size.

Table 1 CNN architecture details

Block              Layer (filter size)    Input size        Output size
Conv1D_1           Conv1D (3*3)           (None, 1, 256)    (None, 1, 64)
Max_pooling1D_1    Max_pooling1D          (None, 1, 64)     (None, 1, 64)
Conv1D_2           Conv1D (3*3)           (None, 1, 64)     (None, 1, 16)
Max_pooling1D_2    Max_pooling1D          (None, 1, 16)     (None, 1, 16)
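To make the terms "same" padding, stride 1, "relu", and size-preserving pooling concrete, the following sketch applies a single three-tap kernel along a short signal. It is a generic illustration under assumptions: it convolves along the sample axis with one filter and does not reproduce the (None, 1, 256) tensor layout of Table 1; the kernel values and function names are hypothetical.

```python
def conv1d_same(signal, kernel, bias=0.0):
    """1-D cross-correlation with stride 1 and zero padding ("same"),
    so the output has the same length as the input (odd kernel sizes)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel)) + bias
        for i in range(len(signal))
    ]

def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, pool_size=1):
    """Non-overlapping max pooling; pool_size=1 leaves the length unchanged."""
    return [max(values[i:i + pool_size])
            for i in range(0, len(values) - pool_size + 1, pool_size)]

signal = [0.0, 1.0, 2.0, 3.0, 0.0]
features = max_pool1d(relu(conv1d_same(signal, kernel=[1.0, 0.0, -1.0])))
print(features)  # [0.0, 0.0, 0.0, 2.0, 3.0]
```

With this difference kernel, only the falling edge of the signal survives the activation, which is the kind of local feature a convolution layer can pick out before the sequence is passed to the LSTM.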


In the forward propagation process, the input neurons are kept unchanged, and the weight parameter matrix
is initialized using a random strategy. The error between the parameter set $P_i'$ output by forward propagation
and the actual pulse parameter set $P_i$ in the training set is calculated using the loss function, as expressed in
Eq. (7), where $P_i$ is the actual parameter set of the $i$-th sample and $P_i'$ is the parameter set estimated by
forward propagation. For a training set with $N$ samples, the mean square error (MSE) of the parameter set is
taken as the value of the loss function, denoted $L_{MSE}$:

$$L_{MSE} = \frac{1}{N} \sum_{i=1}^{N} (P_i - P_i')^2 \tag{7}$$
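Equation (7) amounts to a few lines of code. In the sketch below each parameter set is reduced to a single number (the amplitude), which is a simplifying assumption; the example values are the four severely distorted pulses and their CNN-LSTM estimates from Table 2, and `mse_loss` is a hypothetical name.

```python
def mse_loss(actual, estimated):
    """Mean square error of Eq. (7) for scalar parameters."""
    if len(actual) != len(estimated):
        raise ValueError("actual and estimated must have the same length")
    return sum((p - q) ** 2 for p, q in zip(actual, estimated)) / len(actual)

actual = [2000.0, 2000.0, 2000.0, 2000.0]
estimated = [1995.0, 1998.0, 1999.0, 2003.0]
print(mse_loss(actual, estimated))  # 9.75
```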


Subsequently, the back-propagation through time (BPTT) algorithm is applied to feed the gradient of the loss
function $L_{MSE}$ back to the CNN-LSTM network to update the weight matrix $W$, so as to reduce the error
in subsequent iterations. To prevent gradient explosion, the model sets the gradient clipping parameters clipnorm
to 1 and clipvalue to 0.5.
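The two clipping settings act differently on a gradient vector: norm clipping rescales the whole vector, whereas value clipping limits each component. The sketch below shows each operation on its own with the thresholds quoted above; how a particular framework combines the two when both are set is not assumed here, and the function names are hypothetical.

```python
import math

def clip_by_norm(grad, clipnorm):
    """Rescale the gradient so that its L2 norm does not exceed clipnorm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= clipnorm:
        return list(grad)
    scale = clipnorm / norm
    return [g * scale for g in grad]

def clip_by_value(grad, clipvalue):
    """Limit every gradient component to the range [-clipvalue, clipvalue]."""
    return [max(-clipvalue, min(clipvalue, g)) for g in grad]

grad = [3.0, -4.0]                        # toy gradient with L2 norm 5
by_norm = clip_by_norm(grad, clipnorm=1.0)    # approximately [0.6, -0.8]
by_value = clip_by_value(grad, clipvalue=0.5) # [0.5, -0.5]
print(by_norm, by_value)
```

Norm clipping preserves the direction of the update, while value clipping can change it, which is one reason the two are usually treated as separate options.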

Fig. 3 shows the hyperparameter optimization process for the parameters and layers during the training
of the CNN-LSTM model. If the batch size is set too large, it can lead to memory overflow during
training, and the model is prone to converging to local optima, making it impossible to complete the
training. If the batch size is too small, the model converges too slowly and the training time becomes
too long. Fig. 3 shows the iterative loss values obtained for the training and validation sets when the
LSTM model had five layers and the parameter batch_size was set to 10 and 100, respectively.
When batch_size was 100, the model converged at the 40th epoch with a high loss of $3 \times 10^5$. When batch_size
was 10, the model converged normally, and the loss value after convergence approached zero.
Fig. 3 Process of the hyperparameter optimization
In principle, the more layers the LSTM model has, the better the training results; however, the vanishing
gradient problem must also be considered, and an increase in the number of layers results in a greater
computational burden. Therefore, when optimizing the hyperparameters, the number of layers is usually set to 3-6.
Fig. 3 shows the iterative loss values obtained for the training and validation sets with three and five layers.
When batch_size is 10, the loss of the LSTM model decays at a similar speed with three and five layers, but with
three layers the loss value after convergence is still as high as 2.8* 10°. With five layers, the loss values of the
training and validation sets approached zero.

After hyperparameter optimization, five LSTM layers were used with an initial learning rate of 0.0001 and a
batch size of 10, and Adam was selected as the optimizer. The generated network model structure is shown in Fig.
4 and includes the input layer, hidden layer, output layer, and backpropagation part. The hidden layer includes
the CNN and LSTM models.
Fig. 4 Network model structure of CNN-LSTM. The detailed calculation procedure of the LSTM unit can be found in the article of
Graves et al. [29]


3. Simulation results and experimental verification 


The CNN-LSTM model proposed in this study was applied to the peak correction of the X-ray fluorescence
spectrum. As mentioned above, when the negative exponential pulse sequence output by the measurement system
is significantly distorted, the amplitude of the triangular shaping result is significantly reduced.
According to the generation principle of the digital multichannel spectrum, the amplitude loss of the distorted
pulses after shaping appears as count drift in the X-ray fluorescence spectrum, which prevents an accurate
spectrum from being obtained. The deep-learning-based CNN-LSTM model proposed in this study aims to accurately
estimate the parameters of the distorted pulses from their triangular shaping results. Thus, the peak shift in the
X-ray fluorescence spectrum can be corrected to obtain a more accurate X-ray fluorescence spectrum.


3.1 CNN-LSTM simulation 


3.1.1 Model training 


To verify the effect of the CNN-LSTM model on the parameter estimation of the triangular shaping results
when the negative exponential pulses are severely distorted, we took 10000 samples and divided
them into training, validation, and test sets in a ratio of 7:2:1, with a training period of 100 epochs.
The change in the loss values obtained from the training and validation sets during the training process is shown
in Fig. 5.

In general, as the number of training epochs increased, the loss values of the training and validation
sets showed a downward trend. The validation loss fluctuated somewhat in the later period but soon
stabilized. When the loss values of both the training and validation sets were low and
relatively stable, the model at that point was saved as the best model. In the training process of the proposed
model, the model at the 91st epoch was saved as the best model, with loss values of 2.7895 and 3.5507 for the
training and validation sets, respectively.
Fig. 5 Iterative graph of loss and accuracy on training and validation sets during model training
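Saving the "best" model can be sketched as picking one epoch from the recorded loss history. The criterion used below, the lowest validation loss, is a common stand-in and an assumption; the text describes the choice only as the point where both losses are low and stable. The loss values are toy numbers, not the training history of this study.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch index with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1

toy_val_losses = [9.0, 5.2, 4.1, 3.6, 3.9]  # hypothetical values
print(best_epoch(toy_val_losses))  # 4
```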


3.1.2 Performance evaluation of parameter estimation 


When producing the test set, considering that the degree of pulse distortion may affect the test results of the
model, the samples of the test set were divided into two categories according to the pulse distortion time
mentioned above: slightly distorted pulses and severely distorted pulses, whose triangular shaping results are
shown in Fig. 6. The rising time of the triangular shaping, $t_{up}$, serves as the boundary. If the time of pulse
distortion, $t_{jump}$, is before $t_{up}$, the amplitude of the shaping result decreases significantly, as shown in
Fig. 6, and such pulses are marked as severely distorted pulses. On the other hand, if the time of pulse distortion
is at or after $t_{up}$ (that is, including the critical time $t_{up}$), the amplitude of the shaping result decreases
only slightly, as shown in Fig. 6; therefore, this type of pulse is marked as a slightly distorted pulse.
Fig. 6 Triangular shaping results of negative exponential pulse at different distortion moments


To control the impact of other variables, the amplitude parameter $A_i$ of Pulse1-Pulse8 was fixed at 2000 mV,
and the time interval $T_i$ between adjacent distorted pulses was $500T_{clk}$. The sampling period $T_{clk}$ was
50 ns, and the rising time of the triangular shaping was $100T_{clk}$. Two pulse sequences and their triangular
shaping results are considered for analysis, as shown in Figs. 7 and 8.
Fig. 7 Triangular shaping results of the seriously distorted pulses. a Seriously distorted pulses, b Triangular shaping results


In Fig. 7a, the amplitude value of Pulse1 suddenly changes to zero at the 47th $T_{clk}$, that of Pulse2 at the
60th $T_{clk}$, that of Pulse3 at the 75th $T_{clk}$, and that of Pulse4 at the 90th $T_{clk}$. As mentioned above, the
rising time of the triangular shaping remains at $100T_{clk}$; that is, Pulse1-Pulse4 are distorted before the peak
value of the triangular shaping. Therefore, the amplitude values of the shaping results exhibit a significant loss
compared to the original pulses, as shown in Fig. 7b.
Fig. 8 Triangular shaping results of the slightly distorted pulses. a Slightly distorted pulses, b Triangular shaping results

In Fig. 8a, the amplitude value of Pulse5 suddenly changes to zero at the 100th $T_{clk}$, that of Pulse6 at the
140th $T_{clk}$, that of Pulse7 at the 220th $T_{clk}$, and that of Pulse8 at the 280th $T_{clk}$. The
common feature of Pulse5-Pulse8 is that the distortion time is not before the peak value of the triangular shaping.
Therefore, the amplitude values of the shaping results show no large losses compared to the original pulses, as
shown in Fig. 8b.
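The dependence of the amplitude loss on the truncation time can be illustrated with a simplified shaper. The sketch below is not the shaping algorithm of this study: it assumes a generic scheme (cancel the exponential tail, then apply two cascaded moving sums to obtain a triangle) and an assumed time constant `TAU`. The sampling period, rise time, amplitude, and truncation indices are those given above. In this idealized model, pulses truncated at or after the rise time keep their full peak, whereas Table 2 reports small deviations, so the output is qualitative only.

```python
import math

def exponential_pulse(n_samples, amplitude, k_jump, tau, t_clk):
    """Negative exponential pulse that is forced to zero from sample k_jump on."""
    return [amplitude * math.exp(-k * t_clk / tau) if k < k_jump else 0.0
            for k in range(n_samples)]

def moving_sum(values, window):
    """Running sum over the last `window` samples."""
    out, acc = [], 0.0
    for k, v in enumerate(values):
        acc += v
        if k >= window:
            acc -= values[k - window]
        out.append(acc)
    return out

def triangular_shaping(samples, tau, t_clk, n_up):
    """Simplified triangular shaper with a rise time of n_up samples."""
    d = math.exp(-t_clk / tau)
    # Step 1: cancel the exponential tail, leaving (ideally) one impulse.
    impulses = [samples[0]] + [samples[k] - d * samples[k - 1]
                               for k in range(1, len(samples))]
    # Step 2: two cascaded moving sums turn an impulse into a triangle.
    triangle = moving_sum(moving_sum(impulses, n_up), n_up)
    return [v / n_up for v in triangle]

def distortion_class(k_jump, n_up):
    """Label a pulse by whether it is truncated before the rise time."""
    return "severe" if k_jump < n_up else "slight"

T_CLK, TAU, N_UP, AMPLITUDE = 50e-9, 3.2e-6, 100, 2000.0  # TAU is assumed
peaks = {}
for k_jump in (47, 60, 75, 90, 100, 140, 220, 280):
    pulse = exponential_pulse(500, AMPLITUDE, k_jump, TAU, T_CLK)
    peaks[k_jump] = max(triangular_shaping(pulse, TAU, T_CLK, N_UP))
    print(k_jump, distortion_class(k_jump, N_UP), round(peaks[k_jump], 1))
```

The earlier the truncation occurs within the rise time, the smaller the shaped peak, which is the trend seen for Pulse1-Pulse4.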

During the performance evaluation of the parameter estimation, the CNN-LSTM and LSTM models were
used to estimate the parameters of the above two pulse sequences, and the output results are shown in Fig. 9. The
real value of the eight pulse amplitudes was fixed at 2000 mV, but the amplitude loss of the triangular shaping
results of Pulse1-Pulse4 was very large.
Fig. 9 Amplitude comparison chart of the distorted pulses


Based on the parameter estimation results of the slightly and severely distorted pulses, the absolute and 
relative errors of the different methods used to estimate the pulse amplitudes are summarized in Table 2. 


Table 2 Comparison of estimated values obtained by different models

                  Seriously distorted pulses          Slightly distorted pulses
                  Pulse1   Pulse2   Pulse3   Pulse4   Pulse5   Pulse6   Pulse7   Pulse8
A_real (mV)       2000     2000     2000     2000     2000     2000     2000     2000
A_Tri (mV)        1459.7   1629     1820     1940     1998.4   2016.4   2008.6   2004.4
Δ_Tri (mV)        540.3    371      180      60       1.6      16.4     8.6      4.4
δ_Tri             27.02%   18.55%   9.00%    3.00%    0.08%    0.82%    0.43%    0.22%
A_CNN-LSTM (mV)   1995     1998     1999     2003     2000     2001     2002     2002
Δ_CNN-LSTM (mV)   5        2        1        3        0        1        2        2
δ_CNN-LSTM        0.25%    0.10%    0.05%    0.15%    0.00%    0.05%    0.10%    0.10%
A_LSTM (mV)       1990     1993     2003     2001     2003     2005     2002     2003
Δ_LSTM (mV)       10       7        3        1        3        5        2        3
δ_LSTM            0.50%    0.35%    0.15%    0.05%    0.15%    0.25%    0.10%    0.15%


$A_{real}$ represents the actual pulse amplitude. The measurement methods used in this study were
triangular shaping, the CNN-LSTM model, and the LSTM model, the measurement results of which are represented by
$A_{Tri}$, $A_{CNN-LSTM}$, and $A_{LSTM}$, respectively. This study used the relative error to evaluate the parameter
estimation performance of each algorithm. Let $\delta$ represent the relative error and $\Delta$ the absolute error;
they are calculated as given in Eqs. (8), (9), and (10).

$$\delta_{Tri} = \frac{\Delta_{Tri}}{A_{real}} \times 100\% = \frac{\mathrm{ABS}(A_{real} - A_{Tri})}{A_{real}} \times 100\% \tag{8}$$

$$\delta_{CNN-LSTM} = \frac{\Delta_{CNN-LSTM}}{A_{real}} \times 100\% = \frac{\mathrm{ABS}(A_{real} - A_{CNN-LSTM})}{A_{real}} \times 100\% \tag{9}$$

$$\delta_{LSTM} = \frac{\Delta_{LSTM}}{A_{real}} \times 100\% = \frac{\mathrm{ABS}(A_{real} - A_{LSTM})}{A_{real}} \times 100\% \tag{10}$$


For severely distorted pulses, the average relative error of triangular shaping, taken over the four pulses in
Table 2, was as high as 14.39%, whereas that of CNN-LSTM was only 0.14% and that of LSTM was 0.26%. For slightly
distorted pulses, the average relative error of triangular shaping was 0.39%, whereas that of CNN-LSTM was only
0.06% and that of LSTM was 0.16%. The estimation results also show that the two models can estimate
the pulse amplitude very accurately, whether the pulse is severely or slightly distorted. It is worth noting that
although the performance of the CNN-LSTM model is slightly better than that of LSTM, both deep learning
methods have very high accuracy in pulse parameter estimation. This study introduces a CNN because the input
pulse sequence is relatively complex and the data are too large to be processed directly by the LSTM model.
Using a CNN for downsampling therefore reduces the amount of data and saves computational resources.

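The relative errors of Eqs. (8)-(10) and their group averages can be recomputed directly from the amplitudes listed in Table 2. The sketch below does only that; the variable names are hypothetical, and the first four entries of each list are the severely distorted pulses while the last four are the slightly distorted ones.

```python
def relative_error(a_real, a_est):
    """Relative error in percent, as in Eqs. (8)-(10)."""
    return abs(a_real - a_est) / a_real * 100.0

A_REAL = 2000.0  # actual amplitude of all eight pulses (mV)
estimates = {
    "Tri": [1459.7, 1629, 1820, 1940, 1998.4, 2016.4, 2008.6, 2004.4],
    "CNN-LSTM": [1995, 1998, 1999, 2003, 2000, 2001, 2002, 2002],
    "LSTM": [1990, 1993, 2003, 2001, 2003, 2005, 2002, 2003],
}

means = {}
for name, values in estimates.items():
    errors = [relative_error(A_REAL, v) for v in values]
    severe, slight = errors[:4], errors[4:]
    means[name] = (sum(severe) / len(severe), sum(slight) / len(slight))
    print(f"{name}: severe {means[name][0]:.2f}%, slight {means[name][1]:.2f}%")
```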
3.2 Experimental verification 


In the model simulation, we used two types of distorted pulses to train, validate, and test the CNN-LSTM
model and achieved good test results in the parameter estimation of distorted negative exponential pulses. To
further verify the optimization effect of the parameter estimation on the X-ray fluorescence spectrum, an iron ore
sample was selected as the measurement object for the experimental verification. In a previous study [11], we
identified Fe, Sr, and Sn as the elemental components with high content in this sample. The measurement
system included a high-performance silicon drift detector (FAST-SDD) and a KYW2000A X-ray tube. The effective
detection area of the detector is 25 mm², the detector thickness is 500 µm, and the thickness of the beryllium
window is 0.5 mil. The rated tube voltage was 50 kV and the rated tube current was 0-1 mA. The ADC sampling
frequency was 20 MHz and the sampling period was 50 ns.

The original measured spectrum is shown in Fig. 10. The element with the highest content in the measured full
spectrum is clearly Sr. According to the principle of distorted-pulse generation, elements
with higher counting rates have a higher probability of generating shadow peaks. Therefore, we selected Sr,
the element with the highest content, and took the net counts of its two characteristic peaks and of their
corresponding shadow peaks in the ROIs as the analysis object. There is an unknown peak on the left side of the
Sr characteristic peaks. If such shadow peaks are not processed, they may be mistaken for the characteristic peaks
of other elements. Unreliable net counts in the ROI directly affect the elemental content analysis. Therefore, it is
necessary to correct the shadow peaks in X-ray fluorescence spectra.
Fig. 10 Measurement spectrum of iron ore samples


In this study, the X-ray fluorescence spectrum was optimized using the deep-learning-based CNN-LSTM
model to correct the count drift caused by distorted pulses. The negative exponential pulse
sequence from the measurement process and its triangular shaping results were saved to create the test set,
which requires preprocessing of the negative exponential pulse sequence. To match the trained
CNN-LSTM model, the preprocessing unit primarily discriminates and separates the distorted
pulses. The experimental verification comprised a qualitative analysis of the X-ray fluorescence spectrum
optimization and a quantitative analysis of the spectral peak correction.

3.2.1 Qualitative analysis 


The amplitudes of the distorted pulses estimated by the CNN-LSTM model were used to replace the original
pulse amplitudes, and a comparison of the X-ray fluorescence spectra is shown in Fig. 11. The red
spectral lines represent the results after the peak correction. Magnifying the shadow peak regions in the
logarithmic coordinate system reveals a weak peak to the left of each of the two characteristic peaks of
strontium in Fig. 11. The two characteristic peaks of strontium (chemical symbol Sr) are denoted Sr-1 and Sr-2.
According to the principle of multichannel spectroscopy, the amplitude loss of the triangular shaping results in a
left shift of the counts (also known as a left shift of the peak position), and the left-shifted counts form a new
shadow peak on the left side of the characteristic peak. In Fig. 11, the left shifts of the two characteristic peaks
of strontium form shadow peaks 1 and 2. After the parameters of the distorted pulses were estimated using the
CNN-LSTM model, the left shift of the peak position was effectively corrected, and the shadow peaks were
eliminated.

Qualitative analysis of the X-ray fluorescence spectrum before and after optimization showed that the CNN-LSTM
model trained in this study could effectively correct the shadow peaks caused by pulse distortion and
optimize the X-ray fluorescence spectrum analysis results.
Fig. 11 Correction effect of shadow peak of strontium element characteristic peak

3.2.2 Quantitative analysis 


As mentioned previously, the amplitude loss caused by shaping results in a left shift of the peak position, so 
that a shadow of the characteristic peak forms on its left side. Here, the peak area S denotes the sum of the counts 
over a given channel address interval. S_shadow1 denotes the area of the shadow peak formed by the left shift of 
the first characteristic peak of Sr, and S_peakloss1 denotes the area lost from the first characteristic peak of 
strontium owing to the left shift of the peak position; S_shadow2 and S_peakloss2 are the corresponding quantities 
for the second characteristic peak. They are computed as shown in Eqs. (11), (12), (13), and (14). 


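\[ S_{\mathrm{shadow1}} = \sum_{i=896}^{960} \mathrm{Count}_{i\_\mathrm{origin}} - \sum_{i=896}^{960} \mathrm{Count}_{i\_\mathrm{corrected}} \tag{11} \]

\[ S_{\mathrm{peakloss1}} = \sum_{i=960}^{1024} \mathrm{Count}_{i\_\mathrm{corrected}} - \sum_{i=960}^{1024} \mathrm{Count}_{i\_\mathrm{origin}} \tag{12} \]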


The region of interest (ROI) of shadow peak 1 spans channels 896-960, so S_shadow1 is numerically equal to 
the difference between the peak areas of this channel address interval before and after spectral peak correction, 
as shown in Eq. (11); S_shadow1 corresponds to the shaded area in Fig. 12a. 

S_peakloss1 represents the peak area recovered in the ROI of the first characteristic peak of strontium (Sr-1) 
after calling the model. The ROI of this characteristic peak spans channels 960-1024, as shown by the shaded 
area in Fig. 12b. S_peakloss1 is numerically equal to the difference in the peak area of the first characteristic 
peak ROI before and after the spectral peak correction, as shown in Eq. (12). 

Fig. 12 Spectral comparison of the different ROIs before and after correction. a ROI of shadow1, b ROI of Sr-1, c ROI of shadow2, d 
ROI of Sr-2. 


S_shadow2 represents the area of the shadow peak formed by the left shift of the second characteristic peak 
of strontium. The ROI of shadow peak 2 spans channels 1024-1088, so S_shadow2 is numerically equal to the 
difference in the peak area of the shadow peak ROI before and after spectral peak correction, as shown in 
Eq. (13); S_shadow2 corresponds to the shaded area in Fig. 12c. 

\[ S_{\mathrm{shadow2}} = \sum_{i=1024}^{1088} \mathrm{Count}_{i\_\mathrm{origin}} - \sum_{i=1024}^{1088} \mathrm{Count}_{i\_\mathrm{corrected}} \tag{13} \]

\[ S_{\mathrm{peakloss2}} = \sum_{i=1088}^{1152} \mathrm{Count}_{i\_\mathrm{corrected}} - \sum_{i=1088}^{1152} \mathrm{Count}_{i\_\mathrm{origin}} \tag{14} \]


S_peakloss2 represents the peak area recovered in the ROI of the second characteristic peak of strontium (Sr-2) 
after calling the model. The ROI of this characteristic peak spans channels 1088-1152, as shown by the shaded 
area in Fig. 12d. S_peakloss2 is numerically equal to the difference in the peak area of the second characteristic 
peak ROI before and after the spectral peak correction, as shown in Eq. (14). 
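
The four ROI quantities of Eqs. (11)-(14) reduce to differences of channel sums, which can be sketched as follows. The function names and the toy spectra are hypothetical; only the channel intervals come from the text. The intervals are treated as inclusive at both ends, as the summation limits are written, so adjacent ROIs share their boundary channel.

```python
# ROIs as inclusive channel-address intervals, taken from Eqs. (11)-(14).
ROIS = {
    "shadow1": (896, 960),
    "Sr-1": (960, 1024),
    "shadow2": (1024, 1088),
    "Sr-2": (1088, 1152),
}


def roi_area(spectrum, roi):
    """Peak area S: sum of counts over the ROI's channel interval."""
    lo, hi = ROIS[roi]
    return sum(spectrum[lo:hi + 1])


def shadow_area(origin, corrected, roi):
    """Eqs. (11)/(13): counts removed from a shadow ROI by the correction."""
    return roi_area(origin, roi) - roi_area(corrected, roi)


def peak_loss(origin, corrected, roi):
    """Eqs. (12)/(14): counts gained by a characteristic ROI after correction."""
    return roi_area(corrected, roi) - roi_area(origin, roi)


# Toy spectra with hypothetical counts (not measured data).
origin = [0] * 2048
corrected = [0] * 2048
origin[930], origin[990] = 30, 970      # shadow 1 and Sr-1 before correction
corrected[990] = 1000                   # shadow 1 counts restored to Sr-1
origin[1060], origin[1120] = 10, 190    # shadow 2 and Sr-2 before correction
corrected[1120] = 200                   # shadow 2 counts restored to Sr-2

s_shadow1 = shadow_area(origin, corrected, "shadow1")
s_peakloss1 = peak_loss(origin, corrected, "Sr-1")
s_shadow2 = shadow_area(origin, corrected, "shadow2")
s_peakloss2 = peak_loss(origin, corrected, "Sr-2")
print(s_shadow1, s_peakloss1, s_shadow2, s_peakloss2)
```

In this idealized case every count removed from a shadow ROI reappears in the matching characteristic ROI, so each shadow area equals the corresponding peak loss; in measured spectra the two generally differ.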

To quantify the correction effect of the CNN-LSTM model on the X-ray spectra of the measured iron ore 
samples, two indicators were introduced: the correction ratio R_c and the efficiency ratio R_e. The correction 
ratio is the increase in the peak area of the two characteristic peak ROIs after calling the model, divided by 
the corrected peak area of those two ROIs. The efficiency ratio is the same increase divided by the decrease in 
the peak area of the two shadow peak ROIs. The calculation formulas are given by Eqs. (15) and (16), 
respectively. Ten measurements were performed on the iron ore samples, and the results are listed in Table 3. 


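\[ R_c = \frac{S_{\mathrm{peakloss1}} + S_{\mathrm{peakloss2}}}{\sum_{i=960}^{1024} \mathrm{Count}_{i\_\mathrm{corrected}} + \sum_{i=1088}^{1152} \mathrm{Count}_{i\_\mathrm{corrected}}} \times 100\% \tag{15} \]

\[ R_e = \frac{S_{\mathrm{peakloss1}} + S_{\mathrm{peakloss2}}}{S_{\mathrm{shadow1}} + S_{\mathrm{shadow2}}} \times 100\% \tag{16} \]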
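
As a worked check of the two ratios, the values of the first measurement in Table 3 can be substituted into Eqs. (11)-(16). The correction ratio is written R_c here; the areas are differences of the tabulated ROI areas before and after correction, and the results agree with the ratios listed for that measurement.

```latex
\begin{align*}
S_{\mathrm{peakloss1}} &= 4304.88 - 4231.21 = 73.67, &
S_{\mathrm{peakloss2}} &= 837.99 - 822.07 = 15.92, \\
S_{\mathrm{shadow1}} &= 126.91 - 48.58 = 78.33, &
S_{\mathrm{shadow2}} &= 68.29 - 52.51 = 15.78, \\
R_c &= \frac{73.67 + 15.92}{4304.88 + 837.99} \times 100\%
     = \frac{89.59}{5142.87} \times 100\% \approx 1.74\%, \\
R_e &= \frac{73.67 + 15.92}{78.33 + 15.78} \times 100\%
     = \frac{89.59}{94.11} \times 100\% \approx 95.20\%.
\end{align*}
```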


From the comparison of the measurement results, the corrected X-ray spectrum obtained using the CNN-LSTM 
model to predict the pulse height has two typical features. First, the peak area of the characteristic peak ROI 
increased compared with the original spectrum, and the standard deviation over the repeated measurements 
decreased (from 33.79 to 28.66 for Sr-1 and from 17.16 to 14.94 for Sr-2). Second, the peak area of the shadow 
peak ROI was significantly reduced. According to these two features and the energy conservation theorem, it can 
be inferred that the peak area removed from the shadow peak ROI should theoretically be corrected to the 
characteristic peak ROI, and the correction effect can be evaluated by R_c, defined above. Table 3 shows that 
approximately 86.27% of the peak area removed from the shadow peak ROI was corrected to the characteristic 
peak ROI, and the proportion of the corrected peak area to the peak area of the characteristic peak ROI is 
approximately 1.72%, which is of great significance for X-ray spectroscopy and elemental content analyses. 


Table 3 Details of measurement results 

         Original spectrum                         Corrected spectrum
Times    S_shadow1  S_Sr-1    S_shadow2  S_Sr-2    S_shadow1  S_Sr-1    S_shadow2  S_Sr-2    R_c     R_e
1        126.91     4231.21   68.29      822.07    48.58      4304.88   52.51      837.99    1.74%   95.20%
2        127.26     4289.77   76.46      855.27    51.32      4341.12   49.44      877.24    1.41%   71.21%
3        121.42     4271.21   68.42      831.42    53.39      4317.29   55.14      844.31    1.14%   72.52%
4        121.67     4199.54   72.51      819.33    44.25      4286.53   41.25      824.59    1.80%   84.88%
5        127.36     4229.79   76.79      827.93    49.87      4314.18   45.87      841.02    1.89%   89.92%
6        131.48     4262.17   73.87      831.85    45.66      4351.29   43.73      847.34    2.01%   90.21%
7        127.75     4281.21   70.41      846.83    49.32      4358.91   46.83      846.88    1.49%   76.22%
8        124.61     4209.39   76.82      820.44    55.38      4291.27   52.31      826.72    1.72%   94.05%
9        134.59     4191.29   75.11      790.12    42.26      4276.72   38.22      832.29    2.50%   98.75%
10       117.75     4213.71   71.33      811.58    47.11      4277.11   51.61      824.28    1.49%   84.22%
Avg      126.08     4237.93   73.00      825.68    48.71      4311.93   47.69      840.27    1.72%   86.27%
STD      4.70       33.79     3.15       17.16     3.85       28.66     5.19       14.94     -       -
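
The ratios in Table 3 can be recomputed from the tabulated ROI areas with a short script. This is a sketch for checking the arithmetic, not part of the original analysis; the variable and function names are hypothetical. It evaluates the ratios per measurement and also from the column sums, because the Avg row appears to be consistent with ratios formed from the averaged areas rather than with the mean of the ten per-measurement ratios.

```python
# Table 3 rows: original (shadow1, Sr-1, shadow2, Sr-2),
# then corrected (shadow1, Sr-1, shadow2, Sr-2).
ROWS = [
    (126.91, 4231.21, 68.29, 822.07, 48.58, 4304.88, 52.51, 837.99),
    (127.26, 4289.77, 76.46, 855.27, 51.32, 4341.12, 49.44, 877.24),
    (121.42, 4271.21, 68.42, 831.42, 53.39, 4317.29, 55.14, 844.31),
    (121.67, 4199.54, 72.51, 819.33, 44.25, 4286.53, 41.25, 824.59),
    (127.36, 4229.79, 76.79, 827.93, 49.87, 4314.18, 45.87, 841.02),
    (131.48, 4262.17, 73.87, 831.85, 45.66, 4351.29, 43.73, 847.34),
    (127.75, 4281.21, 70.41, 846.83, 49.32, 4358.91, 46.83, 846.88),
    (124.61, 4209.39, 76.82, 820.44, 55.38, 4291.27, 52.31, 826.72),
    (134.59, 4191.29, 75.11, 790.12, 42.26, 4276.72, 38.22, 832.29),
    (117.75, 4213.71, 71.33, 811.58, 47.11, 4277.11, 51.61, 824.28),
]


def ratios(o_sh1, o_sr1, o_sh2, o_sr2, c_sh1, c_sr1, c_sh2, c_sr2):
    """Return (R_c, R_e) as fractions, following Eqs. (15) and (16)."""
    gain = (c_sr1 - o_sr1) + (c_sr2 - o_sr2)      # S_peakloss1 + S_peakloss2
    shadow = (o_sh1 - c_sh1) + (o_sh2 - c_sh2)    # S_shadow1 + S_shadow2
    return gain / (c_sr1 + c_sr2), gain / shadow


per_row = [ratios(*row) for row in ROWS]
mean_rc = sum(r_c for r_c, _ in per_row) / len(per_row)
mean_re = sum(r_e for _, r_e in per_row) / len(per_row)

# Ratios of the summed areas (equivalent to using the averaged areas).
column_sums = [sum(column) for column in zip(*ROWS)]
pooled_rc, pooled_re = ratios(*column_sums)

print(f"mean of per-row ratios: R_c = {mean_rc:.2%}, R_e = {mean_re:.2%}")
print(f"ratios of summed areas: R_c = {pooled_rc:.2%}, R_e = {pooled_re:.2%}")
```

With these inputs the ratios of the summed areas come out at about 1.72% and 86.27%, matching the Avg row, whereas the mean of the ten per-measurement R_e values is about 85.72%.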
4. Conclusion 


In this study, we trained a CNN-LSTM model for peak correction in X-ray fluorescence spectroscopy. The 
model was trained on randomly generated distorted pulse sequences. To improve training efficiency, the model 
was divided into two parts: a CNN performed feature extraction on the data, and an LSTM network performed 
pulse amplitude estimation. In the simulation, the relative errors in the amplitude estimation of pulse sequences 
with different degrees of distortion were obtained using triangular shaping, CNN-LSTM, and LSTM models. 
For severely distorted pulses, the relative error of the CNN-LSTM model in estimating the pulse parameters was 
reduced by 14.35% compared with that of the triangular shaping algorithm; for slightly distorted pulses, the 
relative error of the CNN-LSTM model was reduced by 0.33%. 

During the experiment, a FAST SDD was used to perform X-ray measurements on the iron ore samples. The 
measured pulse sequence was saved offline as the model input, and the pulse amplitudes output by the model 
were subjected to multichannel pulse-height analysis, yielding an X-ray energy spectrum corrected for shadow 
peaks. The original energy spectrum obtained without calling the model was used as a reference spectrum for 
comparison with the corrected spectrum. The results indicate that the proposed model successfully predicts the 
heights of the measured pulse sequences. To further validate the performance of the model for shadow peak 
correction, ten measurements of the iron ore samples were analyzed. They showed that approximately 86.27% 
of the decrease in the peak area of the shadow peak ROI was corrected to the characteristic peak ROI, and that 
the corrected peak area accounted for approximately 1.72% of the peak area of the characteristic peak ROI. 
This is of great significance for X-ray spectroscopy and elemental analysis. 